A NOTE ON THE CONCEPT OF UNCERTAINTY AS APPLIED IN PSYCHOLOGICAL RESEARCH

Raymond S. Nickerson


As defined by Shannon, the average amount of information (or
uncertainty) associated with the occurrence of an event with $n$
possible outcomes is

$$H(x) = -\sum_{i=1}^{n} p(x_i) \log_2 p(x_i) \qquad (1)$$

where $p(x_i)$ is the probability of the outcome $x_i$.

Both the uncertainty measure and the associated conceptual framework
have been major determinants of many recent trends in experimental
psychology. Several tutorial expositions of the theory and reviews of
related empirical studies are available (e.g., Attneave, 1959; Luce,
1960). The purpose of this note is to distinguish several possible
connotations of $H$ as it has been used in the psychological literature.

1. Perhaps the most straightforward use of uncertainty is as a
measure of non-metric variability. In this sense it is a statistical
property of a set of categorized data, and is given by

$$H_s(x) = -\sum_{i=1}^{n} \frac{N(x_i)}{T} \log_2 \frac{N(x_i)}{T} \qquad (2)$$

where $N(x_i)$ represents the number of occurrences of the outcome $x_i$
and $T$ is the total number of events in the sample. Uncertainty
analysis is similar in some respects to variance analysis and has the
advantage that it can be applied to any set of data which is nominal or
can be reduced to nominal form (Garner & McGill, 1956). The use of
uncertainty as a descriptive statistic requires no assumptions concerning
underlying distributions, sampling procedures, or a priori subjective
probabilities of subjects.
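
As an illustration of equation (2), the following is a minimal Python
sketch that computes the sample statistic directly from a list of nominal
observations. The function name `sample_uncertainty` and the color
labels are hypothetical, chosen only for this example.

```python
from collections import Counter
from math import log2


def sample_uncertainty(observations):
    """Return H_s(x) in bits for a sequence of nominal observations.

    Implements equation (2): relative frequencies N(x_i)/T stand in
    for probabilities; outcomes that never occur contribute nothing.
    """
    counts = Counter(observations)      # N(x_i) for each observed outcome
    total = sum(counts.values())        # T, the number of events
    return -sum((n / total) * log2(n / total) for n in counts.values())


# Hypothetical categorized data: 4 "red", 2 "green", 2 "blue".
responses = ["red", "green", "red", "blue", "red", "green", "red", "blue"]
print(sample_uncertainty(responses))    # relative frequencies 1/2, 1/4, 1/4
```

Nothing in the calculation depends on how the observations were
generated, which is the sense in which the statistic is purely
descriptive.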

2. The term also has been used to connote a parameter of a
theoretical probability distribution, which is not actually measured,
but is given by definition, or inferred from sample statistics. This
usage will be denoted by

$$H_p(x) = -\sum_{i=1}^{n} p_p(x_i) \log_2 p_p(x_i) \qquad (3)$$

where $p_p(x_i)$ represents the defined, or inferred, probability of the
outcome $x_i$. The uncertainties associated with a "fair" toss of an
"unbiased" coin, and with a roll of a perfect die, are by definition 1
and 2.58 bits respectively. Whether or not there are such things as an
unbiased coin or a perfect die is irrelevant. Even if there were,
however, we would expect that with a finite sample of tosses or rolls,
the statistic $H_s(x)$ would be somewhat less than the parameter $H_p(x)$
since all possible outcomes are unlikely to occur with exactly equal
frequency in either case. With small samples of events with several
possible outcomes the difference between $H_p(x)$ and $H_s(x)$ can be
quite large. Miller (1955) has shown that with samples drawn from a
distribution with the parameter $H_p(x)$, in general $H_s(x)$ will be smaller
than $H_p(x)$ by an amount proportional to the number of different possible
outcomes and inversely proportional to the number of observations in the
sample.
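
The gap between the statistic and the parameter can be illustrated with
a small simulation. The Python sketch below is only an illustration
under stated assumptions: the function names, the sample sizes, and the
fixed random seed are hypothetical, and the comparison value
$(n - 1)/(2T \ln 2)$ is the first-order approximation commonly
attributed to Miller (1955), not a formula stated in this note.

```python
import random
from collections import Counter
from math import log, log2


def parameter_uncertainty(probs):
    """H_p(x) in bits for a defined probability distribution, equation (3)."""
    return -sum(p * log2(p) for p in probs if p > 0)


def sample_uncertainty(observations):
    """H_s(x) in bits from observed relative frequencies, equation (2)."""
    counts = Counter(observations)
    total = sum(counts.values())
    return -sum((n / total) * log2(n / total) for n in counts.values())


def mean_sample_uncertainty(n_outcomes, sample_size, n_samples=1000, seed=0):
    """Average H_s(x) over many independent samples of equally likely outcomes."""
    rng = random.Random(seed)           # fixed seed so the sketch is repeatable
    total = 0.0
    for _ in range(n_samples):
        sample = [rng.randrange(n_outcomes) for _ in range(sample_size)]
        total += sample_uncertainty(sample)
    return total / n_samples


h_p_die = parameter_uncertainty([1 / 6] * 6)    # log2(6), about 2.58 bits
for size in (12, 60, 600):
    shortfall = h_p_die - mean_sample_uncertainty(6, size)
    approx = (6 - 1) / (2 * size * log(2))      # assumed first-order bias term
    print(size, round(shortfall, 3), round(approx, 3))
```

The simulated shortfall should shrink roughly in inverse proportion to
the sample size, in line with the statement above.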

3. Frequently in psychological experiments, the stimulus selection
procedure actually used by E is not strictly consistent with the information
given to S concerning the probabilities associated with the outcomes which
could occur. For example, S may be told that each of $k$ stimuli is equally
likely to occur on each trial of the experiment, whereas the stimulus
selection procedure involves constraints such as the forcing of an equal
number of occurrences of each alternative during some segment of an
experimental session, or the avoidance of runs exceeding some predetermined
length. We will represent average uncertainty, as implied by a selection
procedure, as

$$H_c(x) = -\frac{1}{n} \sum_{i=1}^{m} p_n\{s_i\} \log_2 p_n\{s_i\} \qquad (4)$$

where $p_n\{s_i\}$ is the probability of the occurrence of the n-tuple
sequence $\{s_i\}$ in a sample of size $n$, and $m$ is the total number of different
sequences possible given the sampling constraints. This formula simply
makes use of the fact that one may calculate the average information in an
event by calculating the average information in a sequence of events
and dividing by the number of events in the sequence. In the case of no
sampling constraints, i.e., independent sampling on each trial, (3) and (4)
are equivalent, but (3) is easier to compute. However, when forcing
constraints are employed, (3) is inappropriate since $p(x_i)$ changes from
trial to trial as a function of what events have already been selected.

As an example of the possible effects of forcing constraints on $H$, consider
the experiment of four successive tosses of an unbiased coin. Since there are
16 possible sequences of heads and tails, each with probability 1/16, the
uncertainty associated with the outcome of the experiment is four bits,
or one bit per toss. However, if the experiment were constrained to ensure
that the total number of heads would equal the total number of tails, then
there would be only six possible outcomes, and the experiment would now be
worth not more than 2.6 bits, or an average uncertainty of somewhat less
than .7 bits per toss. Note, however, that the statistic $H_s(x)$, when calculated
on the data from such an experiment, would be insensitive to the forcing
constraint and would yield an average uncertainty of one bit per toss.
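
The coin example can be checked by direct enumeration. The following
Python sketch applies equation (4) under the assumption, made here for
illustration, that every sequence admitted by the constraint is equally
likely; the function names are hypothetical.

```python
from collections import Counter
from itertools import product
from math import log2


def constrained_uncertainty(sequences):
    """H_c(x) per trial, equation (4), assuming equally likely sequences."""
    n = len(sequences[0])               # number of trials per sequence
    p = 1 / len(sequences)              # p_n{s_i} for each admissible sequence
    return -sum(p * log2(p) for _ in sequences) / n


def sample_uncertainty(observations):
    """H_s(x) in bits from observed relative frequencies, equation (2)."""
    counts = Counter(observations)
    total = sum(counts.values())
    return -sum((k / total) * log2(k / total) for k in counts.values())


all_sequences = list(product("HT", repeat=4))             # 16 sequences
balanced = [s for s in all_sequences if s.count("H") == 2]  # 6 sequences

print(constrained_uncertainty(all_sequences))   # unconstrained tosses
print(constrained_uncertainty(balanced))        # heads forced to equal tails
print({sample_uncertainty(s) for s in balanced})  # H_s for each forced sequence
```

Every sequence that satisfies the constraint contains two heads and two
tails, so the sample statistic is one bit per toss for all six, even
though the selection procedure carries less than that.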

In general, $H_s(x)$, $H_p(x)$ and $H_c(x)$ will not be equal in any
particular case. More specifically, if $H_s(x)$ and $H_p(x)$ are equal, $H_c(x)$
will be different; if $H_p(x)$ and $H_c(x)$ are equal, $H_s(x)$ will be different.

For example, consider an experiment in which S is told that each of eight
stimuli is equally likely to occur on each of 64 trials, when, in fact, the
sampling procedure is constrained so as to force exactly two occurrences
of each stimulus in each successive block of 16 trials. In this case, the
instructions to the subject imply that $H_p(x)$ is three bits, and the
appropriate calculation would show that $H_s(x)$ also is three bits; however,
because of the sampling constraint, $H_c(x)$ is less than three bits. If on the
other hand the selection procedure is consistent with the instructions, and
the selection of the stimulus on each trial is independent of all previous
selections, then $H_p(x)$ and $H_c(x)$ both equal three bits, but in general,
with such a sampling procedure, all $x_i$ will not occur with equal frequency;
hence, $H_s(x)$ will be less than three bits. One of the purposes of the use
of sampling constraints is, of course, to force $H_s(x)$, as computed from a
stimulus sample, to correspond exactly to $H_p(x)$ as implied by the outcome
probabilities stated to S. When this is the case, in general $H_c(x) \neq H_p(x)$.
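
For the eight-stimulus example, the size of the discrepancy can be
estimated by counting the admissible orderings within one block. The
Python sketch below assumes, for illustration only, that successive
blocks are independent and that all orderings satisfying the constraint
are equally likely; the function name is hypothetical.

```python
from math import factorial, log2


def block_constrained_uncertainty(n_stimuli, reps_per_block):
    """H_c(x) per trial when each stimulus occurs a fixed number of times per block.

    The number of admissible orderings of one block is the multinomial
    coefficient (block length)! / (reps!)^n_stimuli. With equally likely
    orderings, equation (4) reduces to log2(count) / block length.
    """
    block_len = n_stimuli * reps_per_block
    n_sequences = factorial(block_len) // factorial(reps_per_block) ** n_stimuli
    return log2(n_sequences) / block_len


h_p = log2(8)                                   # 3 bits, as told to S
h_c = block_constrained_uncertainty(8, 2)       # 8 stimuli, twice per 16 trials
print(h_p, round(h_c, 2))                       # 3.0 2.27
```

Under these assumptions the selection procedure carries roughly three
quarters of a bit less per trial than the instructions imply.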

4. A fourth use of the concept has been to connote an observer's or
receiver's uncertainty with respect to which of a set of possible outcomes or
message elements will occur. One's degree of uncertainty with respect to a
particular outcome is intuitively analogous to the degree to which he would
be surprised by its occurrence, that is, the inverse of the degree to which he
expects it to occur. An outcome which is expected causes little surprise and
gives little information by its occurrence; whereas the occurrence of an
outcome which was considered unlikely is both surprising and, at least in
a technical sense, informative.

In this case we say

$$H_e(x) = -\sum_{i=1}^{n} p_e(x_i) \log_2 p_e(x_i) \qquad (5)$$

where $p_e(x_i)$ represents the relative likelihood or probability which the
receiver associates with the occurrence of the outcome $x_i$. Whereas one
might assume the expectancies of an "ideal" receiver to be consistent with the
available relevant information concerning the set of alternatives and the
sampling rules, $p_e(x_i)$ represents the expectancies of a human receiver and
may be biased by irrelevancies, or by unfounded assumptions about probabilistic
events that he brings to the situation, e.g., the so-called "gambler's
fallacy" of assuming sequential dependencies in a series of independent events.
(We should note that in view of the forcing constraints that experimenters
frequently impose on randomization procedures, the gambler's fallacy often
is not so fallacious in the experimental situation as has been supposed.)

Although $H_e(x)$ is the measure which is most directly relevant to
questions of human information processing capabilities, it is by far the
most difficult to assess. Certainly the assumption that $H_e(x)$ corresponds
exactly to either $H_s(x)$, $H_p(x)$ or $H_c(x)$ is not warranted in general. That
this is so is intuitively clear from the fact that different receivers may
gain different amounts of information, i.e., may be surprised to different
degrees, by the occurrence of the same message or outcome. Cronbach (1955) has
shown that a receiver may gain more information from a message if his a priori
expectancies are in error than if they in fact are consistent with the
properties of the source. Moreover, one's ability to state outcome
probabilities, or to describe the process by which E selects the outcomes,
does not reveal the nature of his expectancies on individual trials of an
experiment. It seems likely that even in the case of well-informed and
mathematically sophisticated individuals expectancies may be subject to
trial-by-trial variations resulting from idiosyncratic guessing strategies,
memory limitations, and momentary attention shifts.
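
One way to make the point about erroneous expectancies concrete is to
average the receiver's surprisal, $-\log_2 p_e(x_i)$, over the outcome
probabilities of the source. The Python sketch below does this; it is an
illustrative formalization, not necessarily the analysis given by
Cronbach (1955), and the function names and the two sets of expectancies
are hypothetical.

```python
from math import log2


def uncertainty(probs):
    """Average uncertainty in bits of a probability distribution."""
    return -sum(p * log2(p) for p in probs if p > 0)


def average_surprisal(source_probs, expected_probs):
    """Mean of -log2 p_e(x_i), weighted by how often the source emits x_i."""
    return -sum(p * log2(q)
                for p, q in zip(source_probs, expected_probs) if p > 0)


source = [0.25, 0.25, 0.25, 0.25]       # four equally likely outcomes
accurate = [0.25, 0.25, 0.25, 0.25]     # expectancies that match the source
biased = [0.70, 0.10, 0.10, 0.10]       # expectancies that are in error

print(average_surprisal(source, accurate))   # equals the source uncertainty
print(average_surprisal(source, biased))     # larger than the source uncertainty
```

A receiver whose expectancies match the source is surprised, on the
average, by exactly the source uncertainty; any mismatch raises the
average surprisal, which is the sense in which the mistaken receiver
"gains more information" from the same outcomes.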